Fuqiang LI Tongzhuang ZHANG Yong LIU Guoqing WANG
A previously ignored side effect of contrast enhancement in representative SIFT-based vein recognition models, namely the mismatches it introduces, is investigated. To retain the benefit of contrast enhancement in generating more keypoints, a hierarchical keypoint selection and mismatch removal strategy is designed, which obtains state-of-the-art recognition results.
Qiao YU Shujuan JIANG Yanmei ZHANG
Class imbalance has drawn much attention from researchers in software defect prediction. In practice, the performance of defect prediction models may be affected by the class imbalance problem. In this paper, we present an approach to evaluating the performance stability of defect prediction models on imbalanced datasets. First, random sampling is applied to convert the original imbalanced dataset into a set of new datasets with different imbalance ratios. Second, typical prediction models are selected to make predictions on these newly constructed datasets, and the Coefficient of Variation (C·V) is used to evaluate the performance stability of the different models. Finally, an empirical study is designed to evaluate the performance stability of six prediction models that are widely used in software defect prediction. The results show that the performance of C4.5 is unstable on imbalanced datasets, and that the performance of Naive Bayes and Random Forest is more stable than that of the other models.
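As a rough illustration of the stability measure described above, the coefficient of variation can be computed as the standard deviation of a model's scores divided by their mean. The sketch below is a minimal example under stated assumptions: the model names and score values are hypothetical placeholders, and the use of the population standard deviation and a percentage scale are choices made here, not details taken from the paper.

```python
from statistics import mean, pstdev


def coefficient_of_variation(scores):
    """Return the coefficient of variation (std / mean) as a percentage."""
    mu = mean(scores)
    if mu == 0:
        raise ValueError("mean is zero; the coefficient of variation is undefined")
    # Assumption: population standard deviation; a sample estimate would use stdev().
    return pstdev(scores) / mu * 100.0


# Hypothetical scores of two models on resampled datasets with different imbalance ratios.
scores_by_model = {
    "model_a": [0.80, 0.79, 0.81, 0.80],
    "model_b": [0.85, 0.70, 0.60, 0.75],
}

cv_by_model = {name: coefficient_of_variation(s) for name, s in scores_by_model.items()}
# A lower value means the performance varies less across imbalance ratios.
most_stable = min(cv_by_model, key=cv_by_model.get)
```

A lower value indicates that a model's performance changes less as the imbalance ratio changes.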
Leida LI Yu ZHOU Jinjian WU Jiansheng QIAN Beijing CHEN
Image retouching, which is widely used to improve the perceptual quality of low-quality images, is fundamental in photography. Traditional image quality metrics are designed for degraded images, so they are limited in evaluating the quality of retouched images. This letter presents a RETouched Image QUality Evaluation (RETIQUE) algorithm that measures structure and color changes between the original and retouched images. Structure changes are measured by gradient similarity. Colorfulness and saturation are used to measure color changes. The overall quality score of a retouched image is computed as a linear combination of gradient similarity and color similarity. The performance of RETIQUE is evaluated on the public Digitally Retouched Image Quality (DRIQ) database. Experimental results demonstrate that the proposed metric outperforms state-of-the-art metrics.
Lei ZHANG Qingfu FAN Wen LI Zhizhen LIANG Guoxing ZHANG Tongyang LUO
Existing trajectory prediction algorithms for moving objects suffer from the data sparsity problem, which degrades prediction accuracy. To address this problem, we present an Entropy-based Sparse Trajectories Prediction method enhanced by Matrix Factorization (ESTP-MF). First, we synthesize trajectories based on trajectory entropy and put the synthesized trajectories into the trajectory space. This alleviates the sparsity of trajectory data and makes the new trajectory space more reliable. Second, in the new trajectory space, we introduce matrix factorization into Markov models to improve sparse trajectory prediction. Matrix factorization is used to infer the transition probabilities of missing regions from the corresponding existing elements of the transition probability matrix, which further mitigates data sparsity. Experiments with a real trajectory dataset show that ESTP-MF generally improves prediction accuracy by as much as 6% and 4% compared to the SubSyn and STP-EE algorithms, respectively.
Yuhu CHENG Xuesong WANG Ge CAO
A multi-source Tri-Training transfer learning algorithm is proposed by integrating transfer learning and semi-supervised learning. First, multiple weak classifiers are trained using both weighted source and target training samples. Then, based on the idea of co-training, each target testing sample is labeled by the trained weak classifiers, and the samples that receive the same label are selected as high-confidence samples and added to the target training sample set. Finally, a target domain classifier is obtained from the updated target training samples. These steps are iterated until the high-confidence samples selected at two successive iterations are the same. At each iteration, the source training samples are tested with the target domain classifier: samples classified correctly continue to be used for training, while the weights of samples classified incorrectly are lowered. Experimental results on a text classification dataset demonstrate the effectiveness and superiority of the proposed algorithm.
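The high-confidence selection step can be pictured with a small sketch. It assumes that "the same label" means unanimous agreement among the weak classifiers, which is one reading of the abstract rather than a confirmed detail. The function name `select_high_confidence` and the threshold classifiers are hypothetical stand-ins, not the paper's classifiers.

```python
def select_high_confidence(samples, classifiers):
    """Return (sample, label) pairs on which every classifier predicts the same label."""
    selected = []
    for sample in samples:
        labels = {clf(sample) for clf in classifiers}
        if len(labels) == 1:  # assumption: unanimous agreement marks high confidence
            selected.append((sample, labels.pop()))
    return selected


# Hypothetical weak classifiers: simple thresholds on a one-dimensional feature.
weak_classifiers = [
    lambda x: int(x > 0.4),
    lambda x: int(x > 0.5),
    lambda x: int(x > 0.6),
]

target_test_samples = [0.1, 0.45, 0.55, 0.9]
high_confidence = select_high_confidence(target_test_samples, weak_classifiers)
```

Only the samples far from every threshold are selected here; the two samples near the decision boundaries are left unlabeled for a later iteration.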
Lina GONG Shujuan JIANG Qiao YU Li JIANG
Heterogeneous defect prediction (HDP) aims to detect the largest number of defective software modules in one project by using historical data collected from other projects with different metrics. However, these data cannot be used directly because the metric sets differ among projects. Meanwhile, software data have more non-defective instances than defective instances, which may cause a significant bias towards defective instances. To overcome these two restrictions, we propose an unsupervised deep domain adaptation approach to build an HDP model. Specifically, we first map the data of the source and target projects into a unified metric representation (UMR). Then, we design a simple neural network (SNN) model to deal with the heterogeneity and class-imbalance problems in software defect prediction (SDP). In particular, our model introduces the Maximum Mean Discrepancy (MMD) as the distance between the source and target data to reduce the distribution mismatch, and uses the cross-entropy loss function as the classification loss. Extensive experiments on 18 public projects from four datasets indicate that the proposed approach can build an effective prediction model for HDP and outperforms the related competing approaches.
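To make the role of MMD concrete, the following sketch computes a biased estimate of the squared MMD between two small samples. The abstract does not state which kernel is used, so the Gaussian (RBF) kernel, the `gamma` parameter, and the toy data are all assumptions made for illustration.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian kernel between two equal-length feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def mmd_squared(source, target, gamma=1.0):
    """Biased estimate of the squared Maximum Mean Discrepancy between two samples."""
    def mean_kernel(xs, ys):
        total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


# Hypothetical two-dimensional feature vectors for source and target projects.
source_data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
target_data = [(3.0, 3.0), (4.0, 3.0), (3.0, 4.0)]

mismatch = mmd_squared(source_data, target_data)
```

The value is close to zero when the two samples follow similar distributions and grows as they drift apart, which is why it can serve as a term to be minimized alongside the classification loss.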
Heling CAO Shujuan JIANG Xiaolin JU Yanmei ZHANG Guan YUAN
Fault localization is a necessary process of locating faults in buggy programs. This paper proposes a novel approach using dynamic slicing and association analysis to improve the effectiveness of fault localization. Our approach utilizes dynamic slicing to generate a reduced candidate set to narrow the range of faults, and introduces association analysis to mine the relationship between the statements in the execution traces and the test results. In addition, we develop a prototype tool DSFL to implement our approach. Furthermore, we perform a set of empirical studies with 12 Java programs to evaluate the effectiveness of the proposed approach. The experimental results show that our approach is more effective than the compared approaches.
Qiao YU Shujuan JIANG Yingqi LIU
A memory leak occurs when useless objects cannot be released for a long time during program execution. Leaked objects may cause memory overflow and system performance degradation, and in serious cases may even cause the system to crash. This paper presents a dynamic approach for detecting and measuring memory leaked objects in Java programs. First, our approach tracks the program through JDI and records heap information to find potentially leaked objects. Second, we present memory leaking confidence to measure the influence of these objects on the program. Finally, we select three open-source programs to evaluate the efficiency of our approach. Furthermore, we choose ten programs from the DaCapo 9.12 benchmark suite to reveal the time overhead of our approach. The experimental results show that our approach is able to detect and measure memory leaked objects efficiently.
Yong ZHANG Wanqiu ZHANG Dunwei GONG Yinan GUO Leida LI
Considering an uncertain multi-objective optimization system with interval coefficients, this letter proposes an interval multi-objective particle swarm optimization algorithm. To improve its performance, the algorithm employs a crowding distance measure based on the distance and the overlap degree of intervals, and an archive update method based on the acceptance coefficient of the decision-maker. Results show that our algorithm is capable of generating an excellent approximation of the true Pareto front.
Guan YUAN Mingjun ZHU Shaojie QIAO Zhixiao WANG Lei ZHANG
With the extensive use of location-based devices, trajectories of various kinds of moving objects can be collected and stored. As time goes on, the volume of trajectory data increases exponentially, which presents a series of problems in storage, transmission and analysis. Moreover, GPS trajectories are never perfectly accurate and sometimes contain high noise. Therefore, overcoming these problems has become an urgent task in trajectory data mining and related applications. In this paper, an adaptive noise filtering trajectory compression and recovery algorithm based on Compressed Sensing (CS) is proposed. First, a noise reduction model is introduced to filter the high noise in GPS trajectories. Second, the compressed data are obtained by the improved GPS Trajectory Data Compression Algorithm. Third, an adaptive GPS trajectory data recovery algorithm is adopted to approximately restore the compressed trajectories to their original state. Finally, comprehensive experiments on real and synthetic datasets demonstrate that the proposed algorithm not only filters noise well but also achieves a high compression ratio and good recovery performance compared to current algorithms.
Lei ZHANG Guoxing ZHANG Zhizheng LIANG Qingfu FAN Yadong LI
Traditional Markov prediction methods for taxi destinations rely only on the previous 2 to 3 GPS points and neglect long-term dependencies within a taxi trajectory. We adopt a Recurrent Neural Network (RNN) to explore the long-term dependencies when predicting the taxi destination, as the multiple hidden layers of an RNN can store these dependencies. However, the hidden layers of an RNN are very sensitive to small perturbations, which reduces the prediction accuracy as the number of taxi trajectories increases. To improve the prediction accuracy of the taxi destination and reduce the training time, we embed surprisal-driven zoneout (SDZ) into the RNN, yielding a taxi destination prediction method by regularized RNN with SDZ (TDPRS). SDZ not only improves the robustness of TDPRS but also reduces the training time by adopting a partial update of parameters instead of a full update. Experiments with a Porto taxi trajectory dataset show that TDPRS improves the prediction accuracy by 12% compared to the RNN prediction method in [4]. At the same time, the prediction time is reduced by 7%.
Yong WANG Zhiqiu HUANG Rongcun WANG Qiao YU
Spectrum-based fault localization (SFL) is a lightweight approach that aims to help debuggers identify the root causes of failures by measuring the suspiciousness of each program component being faulty and generating a hypothetical fault ranking list. Although SFL techniques have been shown to be effective, the faulty component in a buggy program cannot always be ranked at the top because of its complex fault triggering model. However, it is extremely difficult to model the complex triggering models of all buggy programs. To address this issue, we propose two simple fault triggering models (RIPRα and RIPRβ) and a refinement technique that improves the absolute ranking of faults based on these two models by ruling out some higher-ranked components according to their fault triggering models. Intuitively, our approach is effective if a faulty component is ranked within the top k in the two fault ranking lists output by the two fault localization strategies. Experimental results show that our approach can significantly improve the fault absolute ranking in the three cases.
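For readers unfamiliar with how an SFL ranking list is produced, here is a minimal sketch using the Ochiai suspiciousness formula. The abstract does not say which formula the authors use, so Ochiai is only an assumed, commonly used example, and it does not implement the RIPRα or RIPRβ models. The component names and coverage counts are hypothetical.

```python
import math


def ochiai(failed_cov, passed_cov, total_failed):
    """Ochiai suspiciousness: ef / sqrt(total_failed * (ef + ep))."""
    denominator = math.sqrt(total_failed * (failed_cov + passed_cov))
    return failed_cov / denominator if denominator else 0.0


def rank_components(spectrum, total_failed):
    """Return component names ordered from most to least suspicious."""
    scores = {name: ochiai(ef, ep, total_failed) for name, (ef, ep) in spectrum.items()}
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scores, key=lambda name: (-scores[name], name))


# Hypothetical spectrum: component -> (failed tests covering it, passed tests covering it).
spectrum = {"s1": (2, 0), "s2": (1, 3), "s3": (0, 4)}
ranking = rank_components(spectrum, total_failed=2)
```

A component executed by every failing test and no passing test receives the highest score, so it appears first in the ranking list.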
Zhaolin LU Ziyan ZHANG Yi WANG Liang DONG Song LIANG
This letter presents an image quality assessment (IQA) metric for scanning electron microscopy (SEM) images based on texture inpainting. Inspired by the observation that the texture information of SEM images is quite sensitive to distortions, a texture inpainting network is first trained to extract texture features. Then the weights of the trained texture inpainting network are transferred to the IQA network to help it learn an effective texture representation of the distorted image. Finally, supervised fine-tuning is conducted on the IQA network to predict the image quality score. Experimental results on the SEM image quality dataset demonstrate the advantages of the presented method.
Jun WANG Guoqing WANG Leida LI
A quantized index for evaluating the pattern similarity of two different datasets is designed by calculating the number of correlated dictionary atoms. Guided by this theory, a task-specific biometric recognition model transferred from state-of-the-art DNN models is realized for both face and vein recognition.
Qiusheng HE Xiuyan SHAO Wei CHEN Xiaoyun LI Xiao YANG Tongfeng SUN
To address the influence of scale change on drone-based target tracking, a multi-scale target tracking algorithm based on the color feature tracking algorithm is proposed. The algorithm realizes adaptive scale tracking by training position and scale correlation filters. It first obtains the target center position in the next frame by computing the maximum of the response, where the position correlation filter is learned by a least squares classifier and the dimensionality of the color features is reduced by principal component analysis. The scale correlation filter is obtained from the color features of 33 rectangular areas, which are set by the scale factor around the central location, and its dimensionality is reduced by orthogonal triangular (QR) decomposition. Finally, the location and size of the target are updated by the maximum of the response. Tests on 13 challenging video sequences taken by the drone show that the algorithm adapts to changes in the target scale, and that its robustness and many other performance indicators are better than those of most state-of-the-art methods under illumination variation, fast motion, motion blur and other complex situations.
Zhicheng LU Zhizheng LIANG Lei ZHANG Jin LIU Yong ZHOU
Inspired by the idea of data representation in manifold learning, we derive a novel model that combines the original training images and their tangent vectors to represent each image in the testing set. Unlike previous methods, the L1 norm is used to control the reconstruction error. Because the objective function of the proposed model is non-smooth, we use the majorization-minimization (MM) method to solve the proposed optimization model. Notably, at each iteration a quadratic optimization problem is formulated whose analytical solution can be obtained, which makes the proposed algorithm effective. Extensive experiments on face images demonstrate that our method achieves better performance than some previous methods.
Haiqiang LIU Gang HUA Hongsheng YIN Aichun ZHU Ran CUI
Compressed sensing is an effective compression technique that is widely used to measure signals in distributed sensor networks (DSNs). Considering the limited resources of DSNs, the measurement matrices used in DSNs must be simple. In this paper, we construct a deterministic measurement matrix based on the Gordon-Mills-Welch (GMW) sequence. The column vectors of the proposed measurement matrix are generated by cyclically shifting a GMW sequence. Compared with some state-of-the-art measurement matrices, the proposed measurement matrix has relatively lower computational complexity and needs less storage space, so it is suitable for resource-constrained DSNs. Moreover, because the proposed measurement matrix can be realized with a simple shift register, it is more practical. Simulation results show that, in terms of recovery quality, the proposed measurement matrix performs better than some state-of-the-art measurement matrices.
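The idea of building columns by cyclic shifts can be sketched as follows. This is only an illustration under stated assumptions: the ±1 sequence below is an arbitrary placeholder rather than a GMW sequence, and keeping only the first `num_rows` entries of each shifted column is a choice made here to obtain fewer measurements than signal samples, not a detail confirmed by the abstract.

```python
def cyclic_shift_matrix(sequence, num_rows):
    """Build a matrix whose column j is `sequence` cyclically shifted by j positions.

    Only the first `num_rows` entries of each shifted column are kept (an assumption).
    """
    n = len(sequence)
    if not 0 < num_rows <= n:
        raise ValueError("num_rows must be between 1 and len(sequence)")
    return [[sequence[(i + j) % n] for j in range(n)] for i in range(num_rows)]


def measure(matrix, signal):
    """Compute the compressed measurements y = matrix * signal."""
    return [sum(a * x for a, x in zip(row, signal)) for row in matrix]


# Placeholder +/-1 sequence of length 7; NOT a GMW sequence.
placeholder_sequence = [1, 1, 1, -1, -1, 1, -1]
phi = cyclic_shift_matrix(placeholder_sequence, num_rows=3)

# A sparse signal with a single nonzero entry picks out one column of the matrix.
sparse_signal = [0, 0, 1, 0, 0, 0, 0]
measurements = measure(phi, sparse_signal)
```

Because every column is a shift of the same sequence, only the sequence itself has to be stored, which matches the storage argument made in the abstract.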
Wen LI Shi-xiong XIA Feng LIU Lei ZHANG
Much research has shown that using social ties can improve location prediction performance. However, because the strength of social ties varies constantly with time, using the movement data of a user's close friends at different times can yield better predictive performance. A hybrid Markov location prediction algorithm based on dynamic social ties is presented. Time is divided by absolute time (weeks) to mine the long-term trend in users' social ties, and the movements of each week are then projected onto workdays and weekends to find the changes of the social circle in different time slices. The segmented movements of friends are compared with the history of the user using our modified cross-sample entropy to discover the individuals who have relatively high similarity with the user in different time intervals. Finally, the user's historical movement data and the friends' movements at different times, weighted by their similarity, are combined to build the hybrid Markov model. Experiments on a real location-based social network dataset show that the hybrid Markov location prediction algorithm improves predictive accuracy by 15% compared with location prediction algorithms that consider the global strength of social ties.
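One way to picture the final combination step is a first-order Markov model whose transition counts are weighted per sequence. The sketch below is an assumption-laden illustration: the first-order form, the function names, the location labels, and the weights are hypothetical, and the similarity weights are given directly instead of being derived from the modified cross-sample entropy.

```python
from collections import defaultdict


def weighted_transition_probs(weighted_sequences):
    """Estimate first-order transition probabilities from (sequence, weight) pairs."""
    counts = defaultdict(lambda: defaultdict(float))
    for sequence, weight in weighted_sequences:
        for current, nxt in zip(sequence, sequence[1:]):
            counts[current][nxt] += weight  # each observed transition counts `weight` times
    probs = {}
    for current, next_counts in counts.items():
        total = sum(next_counts.values())
        probs[current] = {nxt: c / total for nxt, c in next_counts.items()}
    return probs


def predict_next(probs, current):
    """Return the most probable next location, or None if `current` was never seen."""
    candidates = probs.get(current)
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Hypothetical movement histories.
user_history = ["home", "work", "home", "work", "home", "gym"]
friend_history = ["home", "gym", "home", "gym"]

# The user's own history gets weight 1.0; the friend's weight stands in for a similarity score.
weak_tie_model = weighted_transition_probs([(user_history, 1.0), (friend_history, 0.4)])
strong_tie_model = weighted_transition_probs([(user_history, 1.0), (friend_history, 0.6)])
```

With the lower weight the user's own habit dominates the prediction from "home", while the higher weight lets the friend's movements change it, which illustrates why the strength of a tie at a given time matters.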
Fengying MA Yankai YIN Wei CHEN
The distinctive characteristics of unmanned aerial vehicle networks (UAVNs), including highly dynamic network topology, high mobility, and open-air wireless environments, may make UAVNs vulnerable to attacks and threats. Because of these special security requirements, research on the reliability of the power supply and communication network in drone monitoring systems is especially important, and both are studied here. To assess the reliability of the power supply in the drone emergency monitoring system, accelerated life tests under constant stress are presented based on the exponential distribution. Through a comparative analysis of many factors, temperature is chosen as the constant accelerated stress parameter. For the statistical analysis of the data, a type-I censoring sample method is put forward. A mathematical model of the drone monitoring power supply is established, and the average life expectancy curve under different temperatures is obtained through the analysis of experimental data. The results demonstrate that the mathematical model and the average life expectancy curve fit the actual situation very well. With overall consideration of the communication network topology and network capacity, an improved EED-SDP method is put forward for drone monitoring. It is concluded that reliability analysis of the power and communication network in a drone monitoring system is important for improving the reliability of the system.
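For an exponential lifetime model under type-I (time) censoring, the standard maximum likelihood estimate of the mean life is the total time on test divided by the number of observed failures. The sketch below shows that textbook estimator for a single stress level; the function name and the numbers are hypothetical, and the abstract does not give the authors' data or how results at different temperatures are linked.

```python
def exponential_mean_life_type1(failure_times, num_units, censor_time):
    """MLE of the mean life for exponential lifetimes under type-I censoring.

    failure_times: times of units that failed before the test stopped
    num_units:     total number of units placed on test
    censor_time:   fixed time at which the test was stopped
    """
    num_failures = len(failure_times)
    if num_failures == 0:
        raise ValueError("at least one failure is needed for a finite estimate")
    if num_failures > num_units:
        raise ValueError("more failures than units on test")
    if any(t < 0 or t > censor_time for t in failure_times):
        raise ValueError("failure times must lie within [0, censor_time]")
    # Surviving units each contribute the full test duration to the time on test.
    total_time_on_test = sum(failure_times) + (num_units - num_failures) * censor_time
    return total_time_on_test / num_failures


# Hypothetical test at one temperature: 5 units, stopped at 500 hours, 3 failures observed.
mean_life_hours = exponential_mean_life_type1([100.0, 200.0, 300.0], num_units=5, censor_time=500.0)
```

Repeating the estimate at several temperatures would give the points from which a life expectancy curve over temperature could be fitted.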
Hao WANG GaoJun LIU Jianyong DUAN Lei ZHANG
Existing studies on transportation mode detection from global positioning system (GPS) trajectories mainly adopt handcrafted features. These features require researchers with a professional background and do not always work well because of the complexity of traffic behavior. To address these issues, we propose a model using a sparse autoencoder to extract point-level deep features from point-level handcrafted features. A convolution neural network then aggregates the point-level deep features and generates a trajectory-level deep feature. A deep neural network incorporates the trajectory-level handcrafted features and the trajectory-level deep feature for detecting the users' transportation modes. Experiments conducted on Microsoft's GeoLife data show that our model can automatically extract the effective features and improve the accuracy of transportation mode detection. Compared with the model using only handcrafted features and shallow classifiers, the proposed model increases the maximum accuracy by 6%.